Ranking Cases with Decision Trees: a Geometric Method that Preserves Intelligibility
Abstract
This paper proposes a new method for ranking the cases classified by a decision tree. The method is applied a posteriori, without modifying the tree and without additional training cases. It consists of computing the distance from each case to the decision boundary induced by the decision tree and ranking the cases according to this geometric score. When the data are numeric, the method is easy to implement and efficient. The distance-based score is a global assessment, in contrast to other methods that evaluate the score at the level of the leaf. It gives good results even with a pruned tree, so if the tree is intelligible, this property is preserved while the ranking ability improves. The main reason for the effectiveness of the geometric method is that, in most cases, when the classifier is sufficiently accurate, errors are located near the decision boundary.
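The geometric score described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the tree, the `LEAVES` table, and all function names are hypothetical, and it assumes numeric data, axis-parallel splits, two classes, and a Euclidean metric, so that each leaf is a box and the distance to the decision boundary is the distance to the nearest leaf predicting a different class.

```python
import math

INF = math.inf

# Hypothetical axis-parallel tree on two numeric features:
#   x0 <= 0.5 -> class 0; otherwise x1 <= 0.3 -> class 0, else class 1.
# Each leaf is stored as (lower bounds, upper bounds, predicted class).
LEAVES = [
    ((-INF, -INF), (0.5, INF), 0),
    ((0.5, -INF), (INF, 0.3), 0),
    ((0.5, 0.3), (INF, INF), 1),
]

def predict(x):
    """Return the class of the leaf whose box contains x (lo < x_i <= hi)."""
    for lo, hi, label in LEAVES:
        if all(l < v <= h for l, v, h in zip(lo, x, hi)):
            return label
    raise ValueError("x is not covered by any leaf")

def distance_to_box(x, lo, hi):
    """Euclidean distance from x to its projection onto the closed box [lo, hi]."""
    return math.sqrt(sum((v - min(max(v, l), h)) ** 2 for v, l, h in zip(x, lo, hi)))

def boundary_distance(x):
    """Distance from x to the nearest leaf that predicts a different class."""
    own = predict(x)
    return min(distance_to_box(x, lo, hi) for lo, hi, label in LEAVES if label != own)

def score(x):
    """Signed score: positive on the class-1 side, negative on the class-0 side."""
    d = boundary_distance(x)
    return d if predict(x) == 1 else -d

def rank(cases):
    """Sort cases from most confidently class 1 to most confidently class 0."""
    return sorted(cases, key=score, reverse=True)
```

Calling `rank` on a list of feature tuples orders them by signed distance, so two cases falling in the same leaf can still receive different scores, which a leaf-level score would not allow. Unscaled features with very different ranges would dominate the Euclidean distance, so some normalization is assumed to have been applied beforehand.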
Similar resources
Keep the Decision Tree and Estimate the Class Probabilities Using its Decision Boundary
This paper proposes a new method to estimate the class membership probability of the cases classified by a Decision Tree. This method provides smooth class probability estimates, without any modification of the tree, when the data are numerical. It applies a posteriori and doesn’t use additional training cases. It relies on the distance to the decision boundary induced by the decision tree. Th...
Ranking Cases with Classification Rules
Many real-world machine learning applications require a ranking of cases, in addition to their classification. While classification rules are not a good representation for ranking, the human comprehensibility aspect of rules makes them an attractive option for many ranking problems where such model transparency is desired. There have been numerous studies on ranking with decision trees, but not m...
A robust aggregation operator for multi-criteria decision-making method with bipolar fuzzy soft environment
Molodtsov initiated soft set theory, which provides a general mathematical framework for handling the uncertainties encountered in data by affixing a parameterized factor during information analysis, in contrast to fuzzy and bipolar fuzzy set theory. The main object of this paper is to lay a foundation for providing a new application of bipolar fuzzy soft tool in ...
Measuring DMU-efficiency by modified cross-efficiency approach
A fundamental weakness of the Data Envelopment Analysis (DEA) is its weak discrimination in cases when a small number of decision making units are compared. Therefore, in such cases the basic DEA model (optimistic and pessimistic) is used in combination with other methods or additional constraints are added to the model. In this paper, the cross-efficiency method was combined with a self-rankin...
Learning to Rank Cases with Classification Rules
An advantage of rule induction over other machine learning algorithms is the comprehensibility of the models, a requirement for many data mining applications. However, many real life machine learning applications involve the ranking of cases and classification rules are not a good representation for this. There have been numerous studies to incorporate ranking capability into decision trees, bu...